-
Notifications
You must be signed in to change notification settings - Fork 8.3k
generalize async notification and add queued operation manager #22853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
All checks are passing now. checkpatch (informational only, not a failure)Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages. |
cd0077c to
e1357c4
Compare
nordic-krch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments. I will continue tomorrow.
e1357c4 to
4009e56
Compare
| if (rv < 0) { | ||
| return rv; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably in majority of cases there will single operation only, it would be good to optimized this. I've been thinking about two things:
- using only
currentto indicate that module is busy, finalizing stage can be encoded by using invalid address (e.g.UINTPTR_MAX) - when it is used then process can be started immediately and locklessy (note that for that atomic.h must be extended: samples: fxos8700-hid: Fix using gpio_pin_get() #22680 :
if (atomic_cas((atomic_t *)&mgr->current, (atomic_t)NULL, (atomic_t)op)) {
mgr->vtable->process(mgr, op);
return rv;
}
I tried it (and adapt getting next op) and measured time which went from 5.5us to 3.9us on nrf52.
I was trying if it is feasible (nordic-krch@47371da)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've addressed all the concerns except this one. (Aside: I think you have the wrong link for an extension to atomic.h.)
I'm not taking this suggestion, as I think it adds significant complexity (magic non-pointer values stored in pointers; a new synchronization mechanism) with little real value (1.5 us per submitted operation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not taking this suggestion, as I think it adds significant complexity
Here i disagree because:
- one variable to keep busy state which is simpler
- selecting next to process is simplified
- speed up (you say 1.5us, I say 30%)
- a bit less rom (150 bytes less)
| * | ||
| * This function must be invoked by services that support queued | ||
| * operations when the operation provided to them through the process | ||
| * function have been completed. It is not intended to be invoked by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's worth stressing that this should be called at the and of completion notification because it may result with process call with NULL operation before actual action is performed. For example if process function is synchronous then finalizing may look like:
void process(struct queued_operation_manager *mgr,
struct queued_operation *op)
{
if (op) {
queued_operation_finalize(mgr, op, 0);
// do some processing
} else {
// shut down resources
}
}
In the case above process(mng, NULL) will be called in the context of queued_operation_finalize call, before actual action and that's probably unintended.
It is implicitly explained in the sentence "During the call..." but i wonder if it is enough.
|
I've been wondering about a way to use manager without priority, to avoid traversing whole list (in locked context) in each submit. E.g. use some reserved priority value which would mean just append to the list. In the manager you don't provide functions for initializing async_notify maybe then onoff should deprecate those and also expect that user will call async_notiify directly. I'm done with review. I like this api. What about splitting PR in 2 (async notify first). Manager is missing tests and will probably take longer to review. |
|
You're right that
I think that can be done, yes.
You may be right. I don't want to deprecate anything right now, because I do want to rename
I'm OK with splitting it, but don't understand the comment about missing tests. The manager has 100% line-coverage in its test. |
Were they in PR yesterday? I didn't see them yesterday, in fact they are included. |
They've been in the PR since it was first submitted. |
Never mind, i must have missed them somehow. |
4009e56 to
9f8f7e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to move this out of kernel/other and put it directly under doc/reference under a new category, for example doc/reference/services or whatever.
9f8f7e0 to
54e21d3
Compare
|
Latest push has no change to implementation, just moves the documentation out of the kernel area. Also rebased on current master. |
54e21d3 to
3139199
Compare
58b4dc2 to
b3a8c14
Compare
|
Rebased on top of #22419 to avoid future merge conflicts and simplify development targeting next release. |
b3a8c14 to
31061d1
Compare
include/sys/async_notify.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think we are going way too far with name lengths in Zephyr.
I suggest we rename this to anotify, here and everywhere else where async_notify appears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with @carlescufi, most of the names are too long:
async_notify_generic_callback
async_notify_fetch_resul
async_notify_init_spinwait
ONOFF_SERVICE_TRANSITIONS_INITIALIZER
ASYNC_NOTIFY_METHOD_COMPLETED
ASYNC_NOTIFY_EXTENSION_MASK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling this header is not namespacing stuff properly.
Shouldn't everything be prefixed with sys_?
include/sys/queued_operation.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is too long. I suggest qdop or something similar.
|
In the absence of clear and consistent guidance naming seems to largely depend on personal taste. I prefer names that unambiguously identify the capability, and (try to) use them consistently for the related data structures, functions, and constants. If that's not the consensus primary consideration for Zephyr, I'll change it to whatever it's supposed to be. Whether that includes adding the prefix |
lib/os/queued_operation.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function should not continue when these two pointers are not equal, i.e. an attempt is made to finalize an operation that is not currently processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's intentional; operations that are cancelled are finalized to -ECANCELED but are not the current operation. However, handling of the finalize field is incorrect in that case, in a way I thought I'd fixed....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but queued_operation_finalize() is no longer called from queued_operation_cancel() context. It is only API call now. I think in that case struct queued_operation *op should be removed from arguments list.
There is one guidance, |
|
regarding names, I also think that they are ok. I'm for the rule: use abbreviations which are common otherwise full word. Unless you want to establish new common abbreviation (like dt in zephyr/linux world). |
89007c2 to
330183e
Compare
mbolivar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few drive by typos.
330183e to
4fa0de9
Compare
|
@pabigot i've been trying to go back to the review after couple of days and noticed complexity increase. I came to the conclusion that the reason for that is the loop in
Because of that i would like to propose following change in the api: add |
|
It's not immediately obvious how adding two optional functions to the API simplifies anything, because there's a inherent complexity to notification/processing/finalization that was rediscovered as @anangl and I went through the previous comments. However you raise an important technical point: But the limitations you describe are exactly the sort of thing the onoff manager is intended to support. So the best way to deal with this may be to add an optional onoff service pointer to the manager state. (This is also an "eat your own dogfood" thing, because the set of manager capabilities were always intended to be components that could be composed to support specific services.) If the queued operation manager does not expose an onoff service, then the manager considers transitions from I think this has some promise, and will dig into it. Thanks for raising this concern. |
include/sys/queued_operation.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it contain result? I know that it is written inside async_notify in operation but it will definitely be needed and fetching it inside callback seems strange as it is already available when this callback is invoked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary. The intermediary translation function provided by the service can extract the result if passing it to the end user as a distinct argument is useful. In a probably common case where the completed operation is simply appended to another list for processing in another context it isn't needed.
I've reworked the state machine to use an optional onoff manager to handle the transitions between The management is slightly simpler now for states specific to transitioning queued operations ( Introduction of the onoff manager introduces the concept of an error in the queued operation manager. There will have to be new API to detect and react to that; it'll come soon. I plan to close this PR soon and split out the pieces related only to async notify and onoff manager to another PR which is more likely to be merged quickly. A new PR in draft format based on that rework will be added to continue work on this feature. Any comments on the changes made here will be addressed in the initial post of that draft PR. |
Full asynchronous support for APIs such as bus transactions generally require managing operations from unrelated clients. This API provides a data structure and functions to manage those operations generically, leaving the service to provide only the service-specific operation description and implementation. Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
| /* Pointer to an on-off service supporting this service. NULL | ||
| * if service is always-on. | ||
| */ | ||
| struct onoff_service *onoff; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not convinced that onoff here is actually needed. Onoff is to manage on-off transistions when multiple users are present. It's not the case here. At least qop_manager is not aware of that. It enables service when going from idle and shuts down when queue is empty. Imo, better approach would be to have asynchronous setup,teardown functions which results in callback with result. It might be that in specific functions implementations onoff service will be used but it's beyond this manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat sympathetic to that position, but I'm going to stick with this approach for now.
A driving goal of the manager framework is to provide components that can be composed to solve complex problems without re-implementing things and trying to handle all the edge cases (e.g., invocation from within an interrupt handler). The onoff manager already provides all the necessary logic to safely and correctly handle an asynchronous transition between an off and on state, regardless of calling context and current underlying state. If it does something wrong, it needs to be fixed. Problems are more likely to be found and addressed in the onoff manager component, than in edge cases that don't get exercised by a specific queued operation manager callback solution.
Also, at least in the near term, the underlying services used by a queued operation manager (e.g. I2C or SPI) may continue to be used without a queued operation manager. They may also need to be able to be turned on and off as we start to inch towards functional power management. If that capability is provided through the onoff manager abstraction then it's available for both ways of using the service.
| * function have been completed. It is not intended to be invoked by | ||
| * users of a service. | ||
| * | ||
| * During the call to this function the service process function will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's no longer true. Process function is not used to indicate no operations pending
| } | ||
| } | ||
|
|
||
| int queued_operation_submit(struct queued_operation_manager *mgr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is getting quite long. Could you split it? e.g. extract putting into list into separate function?
| u32_t state = mgr->state; | ||
|
|
||
| if (state == ST_ERROR) { | ||
| rv = -ENODEV; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interrupts are locked and function leaves
| * This function can be called as a side effect of | ||
| * queued_operation_submit() or queued_operation_finalize() to | ||
| * tell the service that a new operation needs to be | ||
| * processed, or that there are no operations left to process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no longer true
| { | ||
| bool loop = false; | ||
|
|
||
| do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still don't understand why you need a loop here. I only see a benefit that if process function is synchronous then process is not called recursively but this isn't a goal of this module. User is already warned in the header file that synchronous process may lead to recursive calls. We rather don't expect process to be synchronous because if it would be then there is no point of queuing operations.
Implementation is very complex now, with multiple places where interrupts are locked and unlocked in other functions. It's really hard to understand that. Especially that flow seems simple with only two forks: setup failure, new request pending after teardown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last time I incrementally refactored this based on a belief a loop wasn't necessary, it turned out that it was necessary. So I'm going to prioritize correctness. We can reconsider whether it's possible to avoid looping to react to events received during unlocked periods after the solution has been shown to work correctly and be functionally complete.
| ST_STOPPING, | ||
|
|
||
| /* Service is in an error state. */ | ||
| ST_ERROR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems that there is no way to get out of error state. How to reinitialize the manager?
|
Superseded by #23333. |
SUPERSEDED by #23333
Addressing #22494 in support of #21538 this PR:
See documentation
at https://builds.zephyrproject.org/zephyrproject-rtos/zephyr/22853/docs/reference/index.htmlby digging into the latest results of the Documentation Github Action, finding the artifact, downloading it, uncompressing this, untarring that, and pointing your browser at the html.